REMD Workflow Automation #36

tlfobe · 2024-02-29T23:52:35Z

This PR was merged with PR #35 before it was fully finished, so I'm reopening it as I continue to build out the simulation setup workflow. Here are the notes from the original PR.

Reason for the PR

I think a lot of the friction in getting simulations set up and running comes from the force field parameter generation and assignment being set up to run in the examples directories instead of directly in the simulation workflow. This PR is meant to put the simulation setup, running, and analysis all in one place. Additionally, this PR is meant to take out a lot of the hand tuning required to set up simulations.

Features

Based on specific parameter YML files, parameterization can be handled by different workflows (depending on whether we need to parameterize with OpenFF-2.0 or a Bespoke-Fit FF).
Build the system with packmol in a signac workflow. These workflow functions should check for the specific software they need to run, or its location can be provided in the input files.
Move REMD simulations and MD simulations into a signac workflow that is in analysis_workflows.
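A minimal sketch of the "check for required software, or take it from the input files" idea above. The helper name `find_executable` and its `override` argument are hypothetical and not part of this package; the override stands in for a path read from an input file.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_executable(name: str, override: Optional[str] = None) -> str:
    """Return the path to an executable such as packmol.

    `override` is a hypothetical value taken from an input file. If it is
    given, it must point at an existing file; otherwise PATH is searched.
    """
    if override is not None:
        path = Path(override).expanduser()
        if path.is_file():
            return str(path)
        raise FileNotFoundError(f"{name} not found at configured path: {override}")

    found = shutil.which(name)  # returns None when the name is not on PATH
    if found is None:
        raise FileNotFoundError(
            f"{name} is not on PATH; install it or set its path in the input file"
        )
    return found
```

A workflow operation could call `find_executable("packmol")` before building the system, so that a missing install fails early with a clear message rather than partway through a job.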

To Do's

Add workflow function to build foldamer
Add workflow function to solvate system with chloroform (and minimize). This will require packmol to be installed (or at least an input option that points to packmol, or making it a subpackage)
Add workflow function to build topology file for any system
Add workflow function that sets up and submits REMD simulations
...

tlfobe · 2024-02-29T23:58:37Z

@mattwthompson Do you know the best-practice way to include an OpenEye license for testing with GHAs? I see OpenFF-toolkit has tests with and without OpenEye

mrshirts · 2024-03-01T00:02:23Z

Do you know the best-practice way to include an OpenEye license for testing with GHAs? I see OpenFF-toolkit has tests with and without OpenEye

Yeah, you don't want to include OpenEye in any release package.

mattwthompson · 2024-03-01T15:23:31Z

In terms of packaging, you want to keep it an optional dependency, both because your users might not have a license and because you can't bundle it with a released version (for licensing reasons, to my understanding) or include it as a required dependency, at least with conda-forge infrastructure.

If you want to include it in CI runs, and I'm not totally sure you do, OpenFF's setup is to store a license we're permitted to use in CI (we might have special permission, I don't know the terms of the Shirts license) as an org-level secret that's accessible from actions. (https://github.com/organizations/ORG_NAME/settings/secrets/actions, stored as OE_LICENSE as a secret, which is encrypted and not visible even to us.) This is then accessible in actions and dumped into an encrypted file in a way that's not accessible in logs, and for that matter the secret isn't even accessible when CI is run from a fork.
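A rough sketch of the "dump the secret into a file" step, assuming the workflow exposes the secret to the job as an environment variable. The function name, the variable name `OE_LICENSE_TEXT`, and the target path are hypothetical; this is not OpenFF's actual CI step. The secret is never printed, and the function reports an unset secret (as happens for runs from forks) instead of failing.

```python
import os
import stat
from pathlib import Path


def write_license_from_env(secret_var: str, target: Path) -> bool:
    """Write the text held in the environment variable `secret_var` to `target`.

    Returns False without creating a file when the variable is unset or empty,
    which is what a job triggered from a fork would see.
    """
    text = os.environ.get(secret_var, "")
    if not text:
        return False
    target.write_text(text)
    # Restrict the file to the current user (owner read/write only).
    target.chmod(stat.S_IRUSR | stat.S_IWUSR)
    return True
```

My understanding is that the OpenEye toolkits locate the license through the `OE_LICENSE` environment variable, so a CI step would point that variable at the written file only when the function returns True, and fall back to the open-source path otherwise.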

My $0.02, not knowing what functionality you actually need in automation, is to keep things RDKit/open-source in CI and roll with OpenEye in your local development/production environment. This is how the toolkit is designed to work, including both the high-level API and skipping tests if OpenEye is not installed.
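A small sketch of the "skip tests if OpenEye is not installed" pattern, written with the standard library's `unittest`. The test class and test names are hypothetical, and checking for the `openeye` package only shows that it is importable; it does not prove a valid license is present.

```python
import importlib.util
import unittest

# Assumption: "openeye" is the import name of the OpenEye toolkits.
# A found package does not guarantee a usable license.
HAS_OPENEYE = importlib.util.find_spec("openeye") is not None


class TestParameterization(unittest.TestCase):
    def test_open_source_path(self):
        # Placeholder for a test that only needs RDKit/open-source tools.
        self.assertTrue(True)

    @unittest.skipUnless(HAS_OPENEYE, "OpenEye toolkits are not installed")
    def test_openeye_path(self):
        # Only runs where the optional dependency can be imported.
        import openeye

        self.assertIsNotNone(openeye)
```

With this layout, CI without OpenEye reports the second test as skipped rather than failed, while a local environment with OpenEye runs both.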

tlfobe · 2024-03-02T17:29:43Z

Okay, sounds good! I was asking because I had a test case that assigns parameters for a small foldamer that would go faster on an install with OpenEye, but since it's using the OpenFF toolkit it can still run without OpenEye; it just takes a bit longer (~40 minutes). I might just skip that test altogether when running tests with GHAs.
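One way to skip the slow test only on GitHub Actions, as a sketch. GitHub Actions sets the `GITHUB_ACTIONS` environment variable to `true` on its runners; the helper, class, and test names below are hypothetical.

```python
import os
import unittest


def running_on_github_actions() -> bool:
    """True when the GITHUB_ACTIONS variable set by GitHub's runners is 'true'."""
    return os.environ.get("GITHUB_ACTIONS") == "true"


class TestFoldamerParameters(unittest.TestCase):
    # The condition is evaluated once, when the class is defined.
    @unittest.skipIf(running_on_github_actions(), "slow without OpenEye (~40 minutes)")
    def test_assign_parameters_small_foldamer(self):
        # Placeholder for the slow parameter-assignment test.
        self.assertTrue(True)
```

The test still runs locally, where the variable is normally unset.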

mrshirts · 2024-03-02T17:31:17Z

Yeah, I wouldn't worry about the OpenEye interface.

tlfobe added 2 commits February 29, 2024 16:47

added basic tests for gromacs wrapper and topology builder

b13ac1e

added conda-forge gromacs for simple wrapper tests

bf046c6

have a working gmx wrapper that can run minimizations on single foldamer systems, next is to build topologies for the entire system

4eaa011

tlfobe changed the title ~~Workflow parameterization~~ REMD Workflow Automation Mar 2, 2024

adding workflow functions to parameterize entire box

38ecae7

tlfobe and others added 6 commits March 2, 2024 12:33

made TopologyGenerator a bit more general, so it can handle both single molecule and musytem topology files. Created a decorator to cd into specific job directories to minimize repeated cd in cd out in flow operation functions

4001757

added a doc keyword for system and foldamer file prefixes to not have to call job.doc[build_parameters][foldamer_structure] every time i need to get a name for a file

676b0dd

added some logic to give the option of a file prefix in the system builder object, which fixes the test cases

c68f92a

changed tests to reflect most recent changes to the system builders/topology generators

3ee8bbb

added a topology manager object which stores topologies locally after generation

75ff323

tlfobe self-assigned this May 10, 2024

Theodore Fobe and others added 13 commits August 5, 2024 18:33

added functionality to add structures to local databased of structures/topologies, next is to add topologies to the internally stored structures

c5d4d86

added topology library to repository with some initial molecule files

9825932

added topology to topology builders

8164900

foldamer parameterization is now aware of the topology manager. Next will be to build and parameterize system using openff toolkit and have those topology files be added to the top manager

6d7b94e

implemented openff workflow for building and parameterizing system, added tests for topology_manager using unittest, remd_workflow now currently works up until running/submitting simulations

48a3cd3

fixed test_topology_generator test and made it so that topology_manager_tests now navigate to the test directory before running tests, so that files generated from tests are not left in whatever directory you `run pytest` from

e7f016a

removed asserts to of loading top_manager from file, in a clean install there are no existing test.pkl to load when running the tests

02c7b32

removed absolute path of top files so that they can be accessed on any computer

cecdd6e

working bespoke flow parameterization with xtc, need to see if I can get Psi4 bespoke-fit workflow working, currently getting segmentation faults when using default QCSpec object

704a72d

updated submission scripts for ascent allocation, and changed submission using remd_workflow

9a9029a

updated topology manager and simulation filesystem

c925c72

Merge branch 'workflow_parameterization' of https://github.com/shirtsgroup/terphenyl_simulations into workflow_parameterization

11f13f0

updated test_env to include bespokefit and other requirements

9dd7323

tlfobe and others added 26 commits September 17, 2024 12:43

Merge branch 'workflow_parameterization' of https://github.com/shirtsgroup/terphenyl_simulations into workflow_parameterization

f9e2589

updated topology_manager

a4542f8

added openeye to environment file

cfab4bd

merged with remote

4a51137

started generating parameters for other terphenyl chemistry

4eacf38

rebuilding topology manager files

c79a43f

starting to update build files to have the correct smiles string for chloroform

3c176d2

reran some parameterization on my laptop

e556483

added pmp and mom parameters, fixed bespoke parameter assignment to write topology file out

1b05032

Merge branch 'workflow_parameterization' of https://github.com/shirtsgroup/terphenyl_simulations into workflow_parameterization

e5bd495

merged with remote

ff676ec

added mop_bespoke tetramer topology

6715cb2

added ff storage in topology manager for offxml files

57c2358

added logic to topology manager to use bepsoke FF if it is available instead of running BF workflow

15a5a18

working simulations on alpine, added some generated topologies

bd5b9a6

merged with remote

458aeb8

fixed topology_manager after merge

a596f9f

added bespoke ff for mom and pom foldamers

f7896fe

merged with remote

30a4f69

added pop bespoke FF

28f0a42

added default openff parameters to topologymanager for all terphenyl chemistries and have simulation directories setup for tetramer remd simulations

1360550

updated topology_manager tests to account for removed print statements and different import names

741813e

removed 3.9 from testing since I use some function() -> None: formatting in this package

9738d34

merged with remote

eee5bd3

started updating analysis workflow into new remd workflow file

421d185

Merge branch 'workflow_parameterization' of https://github.com/shirtsgroup/terphenyl_simulations into workflow_parameterization

7053e3f
